CSE408
Brute Force (String Matching, Closest Pair,
Convex Hull, Exhaustive Search, Voronoi Diagrams)
Lecture #7 & 8
Brute Force
A straightforward approach, usually based directly
on the problem’s statement and definitions of
the concepts involved
Brute-Force String Matching
pattern: a string of m characters to search for
text: a (longer) string of n characters to search in
problem: find a substring in the text that matches the pattern
Brute-force algorithm
Step 1 Align pattern at beginning of text
Step 2 Moving from left to right, compare each character of
pattern to the corresponding character in text until
all characters are found to match (successful search); or
a mismatch is detected
Step 3 While pattern is not found and the text is not yet
exhausted, realign pattern one position to the right and
repeat Step 2
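The three steps above can be sketched directly in Python. This is a minimal illustration of the brute-force idea, not an optimized matcher:

```python
def brute_force_match(text, pattern):
    """Return the index of the first occurrence of pattern in text, or -1."""
    n, m = len(text), len(pattern)
    for i in range(n - m + 1):        # Step 3: realign one position to the right
        j = 0
        # Step 2: compare left to right until mismatch or full match
        while j < m and pattern[j] == text[i + j]:
            j += 1
        if j == m:                    # all m characters matched
            return i
    return -1
```

In the worst case this makes m comparisons at each of the n − m + 1 alignments, i.e., O(nm) comparisons.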
Examples of Brute-Force String Matching
1. Pattern: 001011
Text: 10010101101001100101111010
2. Pattern: happy
Text: It is never too late to have a
happy childhood.
Closest-Pair Problem
Find the two closest points in a set of n points
(in the two-dimensional Cartesian plane).
Brute-force algorithm
Compute the distance between every pair of
distinct points
and return the indexes of the points for which
the distance is the smallest.
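A minimal Python sketch of this brute-force approach (each of the n(n−1)/2 distinct pairs is examined once):

```python
from math import dist, inf

def brute_force_closest_pair(points):
    """Check every pair of distinct points; return (best_distance, (i, j))."""
    best, best_pair = inf, None
    n = len(points)
    for i in range(n - 1):
        for j in range(i + 1, n):          # each unordered pair once
            d = dist(points[i], points[j])
            if d < best:
                best, best_pair = d, (i, j)
    return best, best_pair
```

Note that comparing squared distances would avoid the square roots without changing the result; the Θ(n^2) pair count dominates either way.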
Closest-Pair Brute-Force Algorithm (cont.)
Efficiency:
Θ(n^2) multiplications (or square roots)
How to make it faster?
Using divide-and-conquer!
Convex Hull Problem
The convex-hull problem is the problem of
constructing the convex hull for a given set S of n
points
To solve it, we need to find the points that will
serve as the vertices of the polygon in question.
Mathematicians call the vertices of such a
polygon extreme points.
By definition, an extreme point of a convex set is
a point of this set that is not a middle point of
any line segment with endpoints in the set.
How can we solve the convex-hull problem in a
brute-force manner?
Nevertheless, there is a simple but inefficient
algorithm that is based on the following
observation about line segments making up the
boundary of a convex hull
a line segment connecting two points pi and pj of
a set of n points is a part of the convex hull’s
boundary if and only if all the other points of the
set lie on the same side of the straight line
through these two points
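The observation above can be sketched in Python. For each pair of points we form the line ax + by = c through them (a = y2 − y1, b = x1 − x2, c = x1·y2 − y1·x2) and check whether all other points give the sign of ax + by − c on one side only. This is an illustrative sketch of the O(n^3) brute-force idea, returning hull edges rather than ordered vertices:

```python
def brute_force_convex_hull(points):
    """Return the hull edges: pairs of points with all others on one side."""
    edges = []
    n = len(points)
    for i in range(n):
        for j in range(i + 1, n):
            (x1, y1), (x2, y2) = points[i], points[j]
            a, b, c = y2 - y1, x1 - x2, x1 * y2 - y1 * x2
            sides = [a * x + b * y - c for (x, y) in points]
            pos = any(s > 0 for s in sides)
            neg = any(s < 0 for s in sides)
            if not (pos and neg):          # all points on one side (or on the line)
                edges.append((points[i], points[j]))
    return edges
```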
Brute-Force Strengths and Weaknesses
Strengths
wide applicability
simplicity
yields reasonable algorithms for some important problems
(e.g., matrix multiplication, sorting, searching, string
matching)
Weaknesses
rarely yields efficient algorithms
some brute-force algorithms are unacceptably slow
not as constructive as some other design techniques
Exhaustive Search
A brute force solution to a problem involving
search for an element with a special property,
usually among combinatorial objects such as
permutations, combinations, or subsets of a set.
Method:
generate a list of all potential solutions to the problem in a
systematic manner (see algorithms in Sec. 5.4)
evaluate potential solutions one by one, disqualifying
infeasible ones and, for an optimization problem, keeping
track of the best one found so far
when search ends, announce the solution(s) found
Example 1: Traveling Salesman Problem
Given n cities with known distances between
each pair, find the shortest tour that passes
through all the cities exactly once before
returning to the starting city
Alternatively: Find shortest Hamiltonian circuit
in a weighted connected graph
Example: four cities a, b, c, d with distances
ab = 2, ac = 8, ad = 5, bc = 3, bd = 4, cd = 7
How do we represent a solution (Hamiltonian circuit)?
TSP by Exhaustive Search
Tour            Cost
a→b→c→d→a       2+3+7+5 = 17
a→b→d→c→a       2+4+7+8 = 21
a→c→b→d→a       8+3+4+5 = 20
a→c→d→b→a       8+7+4+2 = 21
a→d→b→c→a       5+4+3+8 = 20
a→d→c→b→a       5+7+3+2 = 17
Efficiency:
Θ((n-1)!)
Chapter 5 discusses how to generate permutations fast.
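The exhaustive search over all (n−1)! tours can be sketched in Python. The distance matrix below is taken from the four-city example's tour costs (ab = 2, ac = 8, ad = 5, bc = 3, bd = 4, cd = 7), with cities a..d numbered 0..3:

```python
from itertools import permutations

# Distance matrix for the slide's 4-city example: a=0, b=1, c=2, d=3
D = [[0, 2, 8, 5],
     [2, 0, 3, 4],
     [8, 3, 0, 7],
     [5, 4, 7, 0]]

def tsp_exhaustive(D):
    """Try all (n-1)! tours that start and end at city 0; keep the cheapest."""
    n = len(D)
    best_cost, best_tour = float("inf"), None
    for perm in permutations(range(1, n)):
        tour = (0,) + perm + (0,)
        cost = sum(D[tour[k]][tour[k + 1]] for k in range(n))
        if cost < best_cost:
            best_cost, best_tour = cost, tour
    return best_cost, best_tour
```

For the example this finds the optimal cost 17, matching the table of tours.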
Final Comments on Exhaustive Search
Exhaustive-search algorithms run in a realistic amount
of time only on very small instances
In some cases, there are much better alternatives!
Euler circuits
shortest paths
minimum spanning tree
assignment problem
In many cases, exhaustive search or its variation is the
only known way to get exact solution
The Hungarian method
runs in O(n^3) time.
Voronoi Diagram
The partitioning of a plane with points into
convex polygons such that each polygon contains
exactly one generating point and every point in a
given polygon is closer to its generating point
than to any other.
A Voronoi diagram is sometimes also known as a
Dirichlet tessellation. The cells are called Dirichlet
regions, Thiessen polytopes, or Voronoi polygons.
Voronoi diagrams were considered as early as 1644 by
René Descartes and were used by Dirichlet (1850) in
the investigation of positive quadratic forms.
They were also studied by Voronoi (1907), who
extended the investigation of Voronoi diagrams to
higher dimensions.
They find widespread applications in areas such as
computer graphics, epidemiology, geophysics, and
meteorology
CSE408
Divide and Conquer
Lecture #9&10
Divide-and-Conquer
The most-well known algorithm design strategy:
1. Divide instance of problem into two or more smaller
instances
2. Solve smaller instances recursively
3. Obtain solution to original (larger) instance by combining
these solutions
Divide-and-Conquer Technique (cont.)
A problem of size n (an instance) is divided into subproblem 1 and
subproblem 2, each of size n/2; a solution to subproblem 1 and a
solution to subproblem 2 are then combined into a solution to the
original problem.
It generally leads to a recursive algorithm!
Divide-and-Conquer Examples
Sorting: mergesort and quicksort
Binary tree traversals
Binary search (?)
Multiplication of large integers
Matrix multiplication: Strassen’s algorithm
Closest-pair and convex-hull algorithms
General Divide-and-Conquer Recurrence
T(n) = aT(n/b) + f(n), where f(n) ∈ Θ(n^d), d ≥ 0
Master Theorem:
If a < b^d, T(n) ∈ Θ(n^d)
If a = b^d, T(n) ∈ Θ(n^d log n)
If a > b^d, T(n) ∈ Θ(n^(log_b a))
Note: The same results hold with O instead of Θ.
Examples:
T(n) = 4T(n/2) + n    ⇒ T(n) ∈ Θ(n^2)
T(n) = 4T(n/2) + n^2  ⇒ T(n) ∈ Θ(n^2 log n)
T(n) = 4T(n/2) + n^3  ⇒ T(n) ∈ Θ(n^3)
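The simplified Master Theorem above can be encoded as a small helper. This is an illustrative sketch (it assumes f(n) ∈ Θ(n^d) with integer a, b, d, so the a = b^d comparison is exact):

```python
import math

def master_theorem(a, b, d):
    """Simplified master theorem for T(n) = a*T(n/b) + Theta(n^d), d >= 0.
    Returns a string describing the Theta-class of T(n)."""
    if a < b ** d:
        return f"Theta(n^{d})"
    if a == b ** d:
        return f"Theta(n^{d} log n)"
    return f"Theta(n^{math.log(a, b):.3f})"   # exponent is log_b a
```

Applied to the three examples: (4, 2, 1) gives Θ(n^2), (4, 2, 2) gives Θ(n^2 log n), and (4, 2, 3) gives Θ(n^3).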
Mergesort
Split array A[0..n-1] into about equal halves and make copies
of each half in arrays B and C
Sort arrays B and C recursively
Merge sorted arrays B and C into array A as follows:
Repeat the following until no elements remain in one of the
arrays:
compare the first elements in the remaining unprocessed
portions of the arrays
copy the smaller of the two into A, while incrementing
the index indicating the unprocessed portion of that array
Once all elements in one of the arrays are processed, copy
the remaining unprocessed elements from the other array
into A.
Pseudocode of Mergesort
Pseudocode of Merge
Time complexity: Θ(p + q) = Θ(n) comparisons
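The split-sort-merge steps described above can be sketched as an in-place (top-level) Python function. This is a minimal illustration, not a tuned implementation:

```python
def mergesort(A):
    """Sort list A in place by splitting, recursing, and merging."""
    if len(A) <= 1:
        return
    mid = len(A) // 2
    B, C = A[:mid], A[mid:]           # copies of the two halves
    mergesort(B)
    mergesort(C)
    i = j = k = 0
    while i < len(B) and j < len(C):  # merge: Theta(p+q) comparisons
        if B[i] <= C[j]:
            A[k] = B[i]; i += 1
        else:
            A[k] = C[j]; j += 1
        k += 1
    A[k:] = B[i:] if i < len(B) else C[j:]   # copy the leftover tail
```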
Mergesort Example
8 3 2 9 7 1 5 4                  (input)
8 3 2 9 | 7 1 5 4                (split into halves)
8 3 | 2 9 | 7 1 | 5 4            (split again)
8 | 3 | 2 | 9 | 7 | 1 | 5 | 4    (single elements)
3 8 | 2 9 | 1 7 | 4 5            (merge pairs)
2 3 8 9 | 1 4 5 7                (merge quadruples)
1 2 3 4 5 7 8 9                  (final merge)
The non-recursive
version of Mergesort
starts from merging
single elements into
sorted pairs.
Analysis of Mergesort
All cases have same efficiency: Θ(n log n)
Number of comparisons in the worst case is close to the theoretical
minimum for comparison-based sorting:
⌈log2 n!⌉ ≈ n log2 n − 1.44n
Space requirement: Θ(n) (not in-place)
Can be implemented without recursion (bottom-up)
T(n) = 2T(n/2) + Θ(n), T(1) = 0
Quicksort
Select a pivot (partitioning element); here, the first element
Rearrange the list so that all the elements in the first s positions
are smaller than or equal to the pivot and all the elements in the
remaining n−s positions are larger than or equal to the pivot
(see next slide for an algorithm)
Exchange the pivot with the last element in the first (i.e., ≤)
subarray; the pivot is now in its final position
Sort the two subarrays recursively
Partitioning Algorithm (Hoare-style partition)
With pivot p = A[l], scan i rightward while A[i] < p (stopping if i > r)
and scan j leftward while A[j] > p; swap A[i] and A[j] and continue
until the scans cross (i ≥ j), then swap the pivot into position j.
Time complexity: Θ(r − l) comparisons
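The partitioning scheme described on this slide (Levitin's Hoare-style partition with the first element as pivot) and the resulting quicksort can be sketched as:

```python
def hoare_partition(A, l, r):
    """Partition A[l..r] around pivot A[l]; return the pivot's final index."""
    p = A[l]
    i, j = l, r + 1
    while True:
        i += 1
        while i <= r and A[i] < p:   # scan right for an element >= pivot
            i += 1
        j -= 1
        while A[j] > p:              # scan left for an element <= pivot
            j -= 1
        if i >= j:                   # scans crossed: partition is done
            break
        A[i], A[j] = A[j], A[i]
    A[l], A[j] = A[j], A[l]          # put the pivot in its final position
    return j

def quicksort(A, l=0, r=None):
    if r is None:
        r = len(A) - 1
    if l < r:
        s = hoare_partition(A, l, r)
        quicksort(A, l, s - 1)
        quicksort(A, s + 1, r)
```

On the slide's example [5, 3, 1, 9, 8, 2, 4, 7] the first partition yields [2, 3, 1, 4, 5, 8, 9, 7] with pivot 5 at index 4.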
Quicksort Example
5 3 1 9 8 2 4 7                  (input; pivot 5)
2 3 1 4 5 | 8 9 7                (after first partition; 5 is final)
1 2 | 3 4 | 5 | 7 | 8 9          (partition each subarray)
1 2 3 4 5 7 8 9                  (sorted)
Analysis of Quicksort
Best case: split in the middle, Θ(n log n)
Worst case: already-sorted array, Θ(n^2), since T(n) = T(n-1) + Θ(n)
Average case: random arrays, Θ(n log n)
Improvements:
better pivot selection: median-of-three partitioning
switch to insertion sort on small subfiles
elimination of recursion
These combine to a 20-25% improvement
Considered the method of choice for internal sorting of large files
(n ≥ 10000)
Binary Search
Very efficient algorithm for searching in a sorted array:
compare search key K against the middle element A[m] of
A[0] . . . A[m] . . . A[n-1]
If K = A[m], stop (successful search); otherwise, continue
searching by the same method in A[0..m-1] if K < A[m]
and in A[m+1..n-1] if K > A[m]
l ← 0; r ← n−1
while l ≤ r do
    m ← ⌊(l+r)/2⌋
    if K = A[m] return m
    else if K < A[m] r ← m−1
    else l ← m+1
return -1
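The pseudocode above translates directly to Python:

```python
def binary_search(A, K):
    """Iterative binary search in a sorted list; returns an index of K or -1."""
    l, r = 0, len(A) - 1
    while l <= r:
        m = (l + r) // 2          # floor of the midpoint
        if K == A[m]:
            return m
        elif K < A[m]:
            r = m - 1             # continue in A[l..m-1]
        else:
            l = m + 1             # continue in A[m+1..r]
    return -1
```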
Analysis of Binary Search
Time efficiency
worst-case recurrence: Cw(n) = 1 + Cw(⌊n/2⌋), Cw(1) = 1
solution: Cw(n) = ⌈log2(n+1)⌉
This is VERY fast: e.g., Cw(10^6) = 20
Optimal for searching a sorted array
Limitations: must be a sorted array (not linked list)
Bad (degenerate) example of divide-and-conquer
because only one of the sub-instances is solved
Has a continuous counterpart called bisection method for solving equations in
one unknown f(x) = 0 (see Sec. 12.4)
Binary Tree Algorithms
Binary tree is a divide-and-conquer ready structure!
Ex. 1: Classic traversals (preorder, inorder, postorder)
Algorithm Inorder(T)
if T ≠ ∅
    Inorder(Tleft)
    print(root of T)
    Inorder(Tright)
Efficiency: Θ(n). Why? Each node is visited/printed exactly once.
Binary Tree Algorithms (cont.)
Ex. 2: Computing the height of a binary tree T with subtrees TL and TR:
h(T) = max{h(TL), h(TR)} + 1 if T ≠ ∅, and h(∅) = −1
Efficiency: Θ(n). Why? Each node is visited once.
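The recursive definition above can be sketched in Python, representing a tree node as a (left, right) pair and the empty tree as None (an illustrative encoding, not from the slides):

```python
def height(T):
    """Height of a binary tree: h(empty) = -1, so a single node has height 0."""
    if T is None:                 # empty tree
        return -1
    left, right = T               # node = (left subtree, right subtree)
    return max(height(left), height(right)) + 1
```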
Multiplication of Large Integers
Consider the problem of multiplying two (large) n-digit integers
represented by arrays of their digits such as:
A = 12345678901357986429 B = 87654321284820912836
The grade-school algorithm multiplies every digit b_j of the second
number by every digit a_i of the first, producing n rows of partial
products (one per digit of B, each shifted one place) that are then
added together.
Efficiency: Θ(n2) single-digit multiplications
First Divide-and-Conquer Algorithm
A small example: A · B where A = 2135 and B = 4014
A = (21·10^2 + 35), B = (40·10^2 + 14)
So, A · B = (21·10^2 + 35) · (40·10^2 + 14)
= 21·40·10^4 + (21·14 + 35·40)·10^2 + 35·14
In general, if A = A1A2 and B = B1B2 (where A and B are n-digit numbers
and A1, A2, B1, B2 are n/2-digit numbers),
A · B = A1·B1·10^n + (A1·B2 + A2·B1)·10^(n/2) + A2·B2
Recurrence for the number of one-digit multiplications M(n):
M(n) = 4M(n/2), M(1) = 1
Solution: M(n) = n^2
Second Divide-and-Conquer Algorithm
A · B = A1·B1·10^n + (A1·B2 + A2·B1)·10^(n/2) + A2·B2
The idea is to decrease the number of multiplications from 4 to 3:
(A1 + A2) · (B1 + B2) = A1·B1 + (A1·B2 + A2·B1) + A2·B2,
i.e., (A1·B2 + A2·B1) = (A1 + A2)·(B1 + B2) − A1·B1 − A2·B2,
which requires only 3 multiplications at the expense of 3 extra
additions/subtractions.
Recurrence for the number of multiplications M(n):
M(n) = 3M(n/2), M(1) = 1
Solution: M(n) = 3^(log2 n) = n^(log2 3) ≈ n^1.585
What if we count both multiplications and additions?
Example of Large-Integer Multiplication
2135 * 4014
= (21*10^2 + 35) * (40*10^2 + 14)
= (21*40)*10^4 + c1*10^2 + 35*14
where c1 = (21+35)*(40+14) - 21*40 - 35*14, and
21*40 = (2*10 + 1) * (4*10 + 0)
= (2*4)*10^2 + c2*10 + 1*0
where c2 = (2+1)*(4+0) - 2*4 - 1*0, etc.
This process requires 9 digit multiplications as opposed to 16.
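The recursive three-multiplication scheme (Karatsuba's algorithm) can be sketched in Python. Splitting by decimal digits is one of several valid conventions; this is a minimal illustration:

```python
def karatsuba(a, b):
    """Multiply non-negative integers using 3 recursive multiplications:
    A*B = A1*B1*10^(2h) + ((A1+A2)(B1+B2) - A1*B1 - A2*B2)*10^h + A2*B2"""
    if a < 10 or b < 10:                 # base case: one-digit factor
        return a * b
    n = max(len(str(a)), len(str(b)))
    half = n // 2
    a1, a2 = divmod(a, 10 ** half)       # high and low halves of a
    b1, b2 = divmod(b, 10 ** half)
    p1 = karatsuba(a1, b1)
    p2 = karatsuba(a2, b2)
    p3 = karatsuba(a1 + a2, b1 + b2)     # the single combined product
    return p1 * 10 ** (2 * half) + (p3 - p1 - p2) * 10 ** half + p2
```

Running it on the slide's example, karatsuba(2135, 4014) performs exactly the nested 9-digit-multiplication process sketched above.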
Conventional Matrix Multiplication
Brute-force algorithm for two 2×2 matrices:
[c00 c01]   [a00 a01]   [b00 b01]
[c10 c11] = [a10 a11] * [b10 b11]

            [a00*b00 + a01*b10   a00*b01 + a01*b11]
          = [a10*b00 + a11*b10   a10*b01 + a11*b11]
8 multiplications, 4 additions
Efficiency class in general: Θ(n^3)
Strassen’s Matrix Multiplication
Strassen’s algorithm for two 2×2 matrices (1969):
[c00 c01]   [a00 a01]   [b00 b01]
[c10 c11] = [a10 a11] * [b10 b11]

            [m1 + m4 - m5 + m7   m3 + m5          ]
          = [m2 + m4             m1 + m3 - m2 + m6]
m1 = (a00 + a11) * (b00 + b11)
m2 = (a10 + a11) * b00
m3 = a00 * (b01 - b11)
m4 = a11 * (b10 - b00)
m5 = (a00 + a01) * b11
m6 = (a10 - a00) * (b00 + b01)
m7 = (a01 - a11) * (b10 + b11)
7 multiplications, 18 additions/subtractions
Strassen’s Matrix Multiplication
Strassen observed [1969] that the product of two matrices can be
computed in general as follows, with the entries replaced by blocks:
[C00 C01]   [A00 A01]   [B00 B01]
[C10 C11] = [A10 A11] * [B10 B11]

            [M1 + M4 - M5 + M7   M3 + M5          ]
          = [M2 + M4             M1 + M3 - M2 + M6]
Formulas for Strassen’s Algorithm
M1= (A00 + A11) (B00 + B11)
M2= (A10 + A11) B00
M3= A00 (B01 - B11)
M4= A11 (B10 - B00)
M5= (A00 + A01) B11
M6= (A10 - A00) (B00 + B01)
M7= (A01 - A11) (B10 + B11)
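The seven formulas above can be checked numerically on a 2×2 case. This sketch uses scalar entries; in the real algorithm the same formulas apply to n/2 × n/2 blocks:

```python
def strassen_2x2(A, B):
    """Strassen's 7-multiplication product of two 2x2 matrices (lists of lists)."""
    m1 = (A[0][0] + A[1][1]) * (B[0][0] + B[1][1])
    m2 = (A[1][0] + A[1][1]) * B[0][0]
    m3 = A[0][0] * (B[0][1] - B[1][1])
    m4 = A[1][1] * (B[1][0] - B[0][0])
    m5 = (A[0][0] + A[0][1]) * B[1][1]
    m6 = (A[1][0] - A[0][0]) * (B[0][0] + B[0][1])
    m7 = (A[0][1] - A[1][1]) * (B[1][0] + B[1][1])
    # Combine the seven products into the four entries of C
    return [[m1 + m4 - m5 + m7, m3 + m5],
            [m2 + m4,           m1 + m3 - m2 + m6]]
```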
Analysis of Strassen’s Algorithm
If n is not a power of 2, matrices can be padded with zeros.
Number of multiplications:
M(n) = 7M(n/2), M(1) = 1
Solution: M(n) = 7^(log2 n) = n^(log2 7) ≈ n^2.807,
vs. n^3 for the brute-force algorithm.
Algorithms with better asymptotic efficiency are known, but they
are even more complex and not used in practice.
What if we count both multiplications and additions?
CSE408
Recurrence equations
Lecture #12
Substitution method
The most general method:
1. Guess the form of the solution.
2. Verify by induction.
3. Solve for constants.
EXAMPLE: T(n) = 4T(n/2) + n
[Assume that T(1) = Θ(1).]
Guess O(n^3). (Prove O and Ω separately.)
Assume that T(k) ≤ ck^3 for k < n.
Prove T(n) ≤ cn^3 by induction.
Example of substitution
T(n) = 4T(n/2) + n
     ≤ 4c(n/2)^3 + n
     = (c/2)n^3 + n
     = cn^3 − ((c/2)n^3 − n)      ← desired − residual
     ≤ cn^3                        (desired)
whenever (c/2)n^3 − n ≥ 0, for example, if c ≥ 2 and n ≥ 1.
Example (continued)
We must also handle the initial conditions,
that is, ground the induction with base cases.
Base: T(n) = Θ(1) for all n < n0, where n0 is a suitable constant.
For 1 ≤ n < n0, we have Θ(1) ≤ cn^3, if we pick c big enough.
This bound is not tight!
Recursion-tree method
A recursion tree models the costs (time) of a
recursive execution of an algorithm.
The recursion-tree method can be unreliable,
just like any method that uses ellipses (…).
The recursion-tree method promotes
intuition, however.
The recursion tree method is good for
generating guesses for the substitution
method.
Example of recursion tree
Solve T(n) = T(n/4) + T(n/2) + n^2:
Level 0:  n^2                                         cost n^2
Level 1:  (n/4)^2 + (n/2)^2                           cost (5/16) n^2
Level 2:  (n/16)^2 + (n/8)^2 + (n/8)^2 + (n/4)^2      cost (25/256) n^2
…
Leaves:   Θ(1) each
Total = n^2 (1 + 5/16 + (5/16)^2 + (5/16)^3 + ···)    geometric series
      = Θ(n^2)
The master method applies to recurrences
of the form
T(n) = a T(n/b) + f(n),
where a ≥ 1, b > 1, and f is asymptotically
positive.
The master method
Three common cases
Compare f(n) with n^(log_b a):
1. f(n) = O(n^(log_b a − ε)) for some constant ε > 0.
   f(n) grows polynomially slower than n^(log_b a) (by an n^ε factor).
   Solution: T(n) = Θ(n^(log_b a)).
2. f(n) = Θ(n^(log_b a) · lg^k n) for some constant k ≥ 0.
   f(n) and n^(log_b a) grow at similar rates.
   Solution: T(n) = Θ(n^(log_b a) · lg^(k+1) n).
3. f(n) = Ω(n^(log_b a + ε)) for some constant ε > 0.
   f(n) grows polynomially faster than n^(log_b a) (by an n^ε factor),
   and f(n) satisfies the regularity condition
   that a·f(n/b) ≤ c·f(n) for some constant c < 1.
   Solution: T(n) = Θ(f(n)).
Examples
EX. T(n) = 4T(n/2) + n
a = 4, b = 2 ⇒ n^(log_b a) = n^2; f(n) = n.
CASE 1: f(n) = O(n^(2−ε)) for ε = 1.
∴ T(n) = Θ(n^2).
EX. T(n) = 4T(n/2) + n^2
a = 4, b = 2 ⇒ n^(log_b a) = n^2; f(n) = n^2.
CASE 2: f(n) = Θ(n^2 lg^0 n), that is, k = 0.
∴ T(n) = Θ(n^2 lg n).
EX. T(n) = 4T(n/2) + n^3
a = 4, b = 2 ⇒ n^(log_b a) = n^2; f(n) = n^3.
CASE 3: f(n) = Ω(n^(2+ε)) for ε = 1,
and 4(n/2)^3 ≤ cn^3 (reg. cond.) for c = 1/2.
∴ T(n) = Θ(n^3).
EX. T(n) = 4T(n/2) + n^2/lg n
a = 4, b = 2 ⇒ n^(log_b a) = n^2; f(n) = n^2/lg n.
Master method does not apply. In particular,
for every constant ε > 0, we have n^ε = ω(lg n).
CSE408
Closest pair & Convex Hull
Problem, Insertion Sort
Lecture #13
Divide-and-Conquer
The most-well known algorithm design strategy:
1. Divide instance of problem into two or more smaller instances
2. Solve smaller instances recursively
3. Obtain solution to original (larger) instance by combining these
solutions
Divide-and-Conquer Technique (cont.)
A problem of size n (an instance) is divided into subproblem 1 and
subproblem 2, each of size n/2; a solution to subproblem 1 and a
solution to subproblem 2 are then combined into a solution to the
original problem.
It generally leads to a recursive algorithm!
Closest-Pair Problem by Divide-and-Conquer
Step 0 Sort the points by x (list one) and then by y (list two).
Step 1 Divide the points given into two subsets S1 and S2 by a vertical
line x = c so that half the points lie to the left or on the line and half
the points lie to the right or on the line.
Closest Pair by Divide-and-Conquer (cont.)
Step 2 Find recursively the closest pairs for the left and right
subsets.
Step 3 Set d = min{d1, d2}.
We can limit our attention to the points in the symmetric
vertical strip of width 2d around the dividing line, since any
pair closer than d must lie within it. Let C1 and C2 be the
subsets of points in the left subset S1 and of the right subset
S2, respectively, that lie in this vertical strip. The points in
C1 and C2 are stored in increasing order of their y coordinates,
taken from the second list.
Step 4 For every point P(x,y) in C1, we inspect points in
C2 that may be closer to P than d. There can be no more
than 6 such points (because they must lie in a d × 2d rectangle)!
Closest Pair by Divide-and-Conquer: Worst Case
The worst case scenario is depicted below:
Efficiency of the Closest-Pair Algorithm
Running time of the algorithm (without sorting) is:
T(n) = 2T(n/2) + M(n), where M(n) ∈ Θ(n)
By the Master Theorem (with a = 2, b = 2, d = 1),
T(n) ∈ Θ(n log n)
So the total time, including the initial sort, is Θ(n log n).
Convex hull Algorithm
Convex hull: smallest convex set that includes the given points. An O(n^3)
brute-force algorithm is given in Levitin, Ch. 3.
Assume points are sorted by x-coordinate values
Identify extreme points P1 and P2 (leftmost and rightmost)
Compute upper hull recursively:
find point Pmax that is farthest away from line P1P2
compute the upper hull of the points to the left of line P1Pmax
compute the upper hull of the points to the left of line PmaxP2
Compute lower hull in a similar manner
Efficiency of Convex hull Algorithm
Finding point farthest away from line P1P2 can be done in linear time
Time efficiency: T(n) = T(x) + T(y) + T(z) + T(v) + O(n),
where x + y + z + v ≤ n.
worst case: Θ(n^2), since T(n) = T(n-1) + O(n)
average case: Θ(n) (under reasonable assumptions about
distribution of points given)
If points are not initially sorted by x-coordinate value, this can be
accomplished in O(n log n) time
Several O(n log n) algorithms for convex hull are known
Insertion Sort
Iteration i. Repeatedly swap element i with the one to its left if smaller.
Property. After the ith iteration, a[0] through a[i] contain the first i+1
elements in ascending order.
Example (array indices 0-9):
Start:        2.78 7.42 0.56 1.12 1.17 0.32 6.21 4.42 3.14 7.71
Iteration 1:  no swaps (7.42 > 2.78)
Iteration 2:  0.56 swaps past 7.42 and 2.78:
              0.56 2.78 7.42 1.12 1.17 0.32 6.21 4.42 3.14 7.71
Iteration 3:  1.12 swaps past 7.42 and 2.78:
              0.56 1.12 2.78 7.42 1.17 0.32 6.21 4.42 3.14 7.71
Iteration 4:  1.17 swaps past 7.42 and 2.78:
              0.56 1.12 1.17 2.78 7.42 0.32 6.21 4.42 3.14 7.71
Iteration 5:  0.32 swaps past every element to the front:
              0.32 0.56 1.12 1.17 2.78 7.42 6.21 4.42 3.14 7.71
Iteration 6:  6.21 swaps past 7.42:
              0.32 0.56 1.12 1.17 2.78 6.21 7.42 4.42 3.14 7.71
Iteration 7:  4.42 swaps past 7.42 and 6.21:
              0.32 0.56 1.12 1.17 2.78 4.42 6.21 7.42 3.14 7.71
Iteration 8:  3.14 swaps past 7.42, 6.21, and 4.42:
              0.32 0.56 1.12 1.17 2.78 3.14 4.42 6.21 7.42 7.71
Iteration 9:  no swaps (7.71 > 7.42)
Done:         0.32 0.56 1.12 1.17 2.78 3.14 4.42 6.21 7.42 7.71
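The iteration rule ("repeatedly swap element i with the one to its left if smaller") can be sketched directly in Python:

```python
def insertion_sort(a):
    """Sort list a in place by swapping each element leftward into position."""
    for i in range(1, len(a)):
        j = i
        while j > 0 and a[j] < a[j - 1]:   # swap leftward while smaller
            a[j], a[j - 1] = a[j - 1], a[j]
            j -= 1
```

After iteration i, a[0..i] holds the first i+1 elements in ascending order, exactly the invariant stated on the slides.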
CSE408
BFS, DFS, Connected
components
Lecture #14
Graph Traversal
Depth First Search (DFS)
Breadth First Search (BFS)
Topics
Graph Search (traversal)
How do we search a graph?
At a particular vertex, where shall we go next?
Two common frameworks:
the depth-first search (DFS)
the breadth-first search (BFS)
In DFS, go as far as possible along a single path until you
reach a dead end (a vertex with no outgoing edge or no
unexplored neighbor), then backtrack
In BFS, one explores the graph level by level
(explore all neighbors first, then move on)
Depth-First Search (DFS)
The basic idea behind this algorithm is that it traverses the
graph using recursion
Go as far as possible until you reach a dead end
Backtrack to the previous path and try the next branch
The graph below, started at node a, would be visited in the
following order: a, b, c, g, h, i, e, d, f, j
d
ab
h i
c
g
e
j
f
DFS: Color Scheme
Vertices initially colored white
Then colored gray when discovered
Then black when finished
DFS: Time Stamps
Discover time d[u]: when u is first
discovered
Finish time f[u]: when we backtrack from u
d[u] < f[u]
DFS Example
(The slide figure showing the source vertex and edges is not reproduced
here.) Running DFS from the source vertex assigns each vertex a pair of
discovery/finish timestamps d | f; in the final frame the eight vertices
carry the pairs:
1 | 12    8 | 11    13 | 16
14 | 15   5 | 6     3 | 4
2 | 7     9 | 10
Note that the [d, f] intervals of any two vertices are either nested or
disjoint, reflecting the recursive (parenthesis) structure of DFS.
DFS: Algorithm
DFS(G)
    for each vertex u in V
        color[u] = WHITE; p[u] = NIL
    time = 0
    for each vertex u in V
        if color[u] = WHITE
            DFS-VISIT(u)

DFS-VISIT(u)
    color[u] = GRAY
    time = time + 1; d[u] = time
    for each v in Adj(u)
        if color[v] = WHITE
            p[v] = u
            DFS-VISIT(v)
    color[u] = BLACK
    time = time + 1; f[u] = time
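The pseudocode above can be sketched in Python with an adjacency-list dictionary (an assumed graph representation; vertices are visited in dictionary order):

```python
def dfs(graph):
    """graph: dict mapping vertex -> list of neighbors.
    Returns (d, f): discovery and finish timestamps per vertex."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {u: WHITE for u in graph}
    d, f = {}, {}
    time = 0

    def visit(u):
        nonlocal time
        color[u] = GRAY               # discovered
        time += 1
        d[u] = time
        for v in graph[u]:
            if color[v] == WHITE:
                visit(v)
        color[u] = BLACK              # finished
        time += 1
        f[u] = time

    for u in graph:                   # restart from every unvisited vertex
        if color[u] == WHITE:
            visit(u)
    return d, f
```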
DFS: Complexity Analysis
DFS_VISIT is called exactly once for each vertex
And DFS_VISIT scans all the edges which causes cost of O(E)
Initialization complexity is O(V)
Overall complexity is O(V + E)
DFS: Application
Topological Sort
Strongly Connected Component
Breadth-first Search (BFS)
Search for all vertices that are directly reachable
from the root (called level 1 vertices)
After mark all these vertices, visit all vertices
that are directly reachable from any level 1
vertices (called level 2 vertices), and so on.
In general, level k vertices are directly reachable
from a level k−1 vertex
BFS: the Color Scheme
White vertices have not been discovered
All vertices start out white
Grey vertices are discovered but not fully
explored
They may be adjacent to white vertices
Black vertices are discovered and fully
explored
They are adjacent only to black and gray vertices
Explore vertices by scanning adjacency list of
grey vertices
An Example
(BFS on a 4×4 grid of vertices A-P; the slide figure is not reproduced
here.) Starting from A (level 0), BFS labels every vertex with its
level, i.e., its distance from A: the neighbors of A get level 1, their
undiscovered neighbors get level 2, and so on, until the farthest
vertices are labeled level 5.
BFS: Algorithm
BFS(G, s)
  for each vertex u in V - {s}
    color[u] = WHITE
    d[u] = infinity
    p[u] = NIL
  color[s] = GRAY; d[s] = 0; p[s] = NIL; Q = empty queue
  ENQUEUE(Q, s)
  while (Q not empty)
    u = DEQUEUE(Q)
    for each v in Adj[u]
      if color[v] = WHITE
        then color[v] = GRAY
             d[v] = d[u] + 1; p[v] = u
             ENQUEUE(Q, v)
    color[u] = BLACK
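The algorithm above can be sketched in Python. The explicit color array is folded into the distance table (d[v] is None exactly while v is white); the adjacency-dict representation and the eight-vertex example graph are assumptions for illustration.

```python
from collections import deque

# BFS from source s, following the pseudocode above; d[v] is the BFS level
# of v and parent[] records the BFS tree.
def bfs(graph, s):
    d = {u: None for u in graph}      # None plays the role of "infinity"/WHITE
    parent = {u: None for u in graph}
    d[s] = 0
    q = deque([s])
    while q:
        u = q.popleft()               # DEQUEUE
        for v in graph[u]:
            if d[v] is None:          # v not yet discovered
                d[v] = d[u] + 1
                parent[v] = u
                q.append(v)           # ENQUEUE
    return d, parent

# assumed undirected example graph on vertices r..y
g = {"r": ["s", "v"], "s": ["r", "w"], "t": ["u", "w", "x"],
     "u": ["t", "x", "y"], "v": ["r"], "w": ["s", "t", "x"],
     "x": ["t", "u", "w", "y"], "y": ["u", "x"]}
d, _ = bfs(g, "s")
print(d)  # {'r': 1, 's': 0, 't': 2, 'u': 3, 'v': 2, 'w': 1, 'x': 2, 'y': 3}
```

The distances are the BFS levels: s is level 0, its neighbors r and w are level 1, and so on.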
Example
(figure: BFS trace on the eight-vertex graph r, s, t, u, v, w, x, y from
source s; the distances assigned are d[s]=0, d[r]=d[w]=1, d[t]=d[x]=d[v]=2,
and d[u]=d[y]=3, and the queue evolves from Q: s down to Q: Ø as vertices
are dequeued in BFS order)
BFS: Complexity Analysis
Each vertex is enqueued and dequeued at most
once, costing O(V), and scanning all the
adjacency lists requires O(E)
Overhead for initialization is O(V)
So, total running time is O(V + E)
BFS: Application
Shortest path problem
CSE408
Presorting & Balanced Search Trees
Lecture #15
Transform and Conquer
Algorithms based on the idea of transformation
Transformation stage
Problem instance is modified to be more amenable to solution
Conquering stage
Transformed problem is solved
The major variations differ in what the transformation does:
Instance simplification
Different representation
Problem reduction
Presorting
Presorting is an old idea: sort the data first, and
many questions about it become easier to answer
We saw this with quickhull and the closest-pair problem
Some other simple presorting examples:
Element Uniqueness
Computing the mode of n numbers
Element Uniqueness
Given a list A of n orderable elements,
determine if there are any duplicates of any
element
Brute Force:
  for each x in A
    for each y in A - {x}
      if x = y return not unique
  return unique
Presorting:
  sort A
  for i ← 1 to n-1
    if A[i] = A[i+1] return not unique
  return unique
Runtime?
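Both versions can be sketched in Python (function names are mine, for illustration). The brute force makes Θ(n²) comparisons; the presorting version pays O(n log n) for the sort and then needs only a single O(n) scan of adjacent pairs.

```python
# Element uniqueness, two ways, following the pseudocode above.

def unique_brute_force(a):
    # Compare every pair of elements: Theta(n^2) in the worst case.
    n = len(a)
    for i in range(n):
        for j in range(i + 1, n):
            if a[i] == a[j]:
                return False
    return True

def unique_presort(a):
    # Sort first; any duplicates must now be adjacent: O(n log n) total.
    b = sorted(a)
    for i in range(len(b) - 1):
        if b[i] == b[i + 1]:
            return False
    return True

print(unique_brute_force([3, 1, 4, 1, 5]))  # False (1 appears twice)
print(unique_presort([3, 1, 4, 5]))         # True
```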
Computing a mode
A mode is a value that occurs most often in a list of
numbers
e.g. the mode of [5, 1, 5, 7, 6, 5, 7] is 5
If several different values occur most often any of them can be
considered the mode
“Count Sort” approach: (assumes all values > 0; what if
they aren’t?)
  max ← max(A)
  freq[1..max] ← 0
  for each x in A
    freq[x] += 1
  mode ← 1
  for i ← 2 to max
    if freq[i] > freq[mode] then mode ← i
  return mode
Runtime?
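A sketch of this approach in Python, answering the parenthetical question as an aside: when the values are not small positive integers, a hash map of counts (e.g. collections.Counter) replaces the frequency array. The function name is mine.

```python
# "Count sort" mode: O(n + max(A)) time and O(max(A)) extra space,
# assuming all values are positive integers.
def mode_count(a):
    m = max(a)
    freq = [0] * (m + 1)         # freq[1..max]; index 0 is unused
    for x in a:
        freq[x] += 1
    mode = 1
    for i in range(2, m + 1):
        if freq[i] > freq[mode]:
            mode = i
    return mode

print(mode_count([5, 1, 5, 7, 6, 5, 7]))  # 5
```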
Presort Computing Mode
  sort A
  i ← 0
  modefrequency ← 0
  while i ≤ n-1
    runlength ← 1; runvalue ← A[i]
    while i+runlength ≤ n-1 and A[i+runlength] = runvalue
      runlength += 1
    if runlength > modefrequency
      modefrequency ← runlength
      modevalue ← runvalue
    i += runlength
  return modevalue
return modevalue
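The presort version above can be sketched in Python (the function name is mine). After sorting, equal values form consecutive runs, so one O(n) scan over the runs finds the longest; the O(n log n) sort dominates.

```python
# Presort mode: sort, then scan runs of equal values, as in the
# pseudocode above (0-based indices).
def mode_presort(a):
    b = sorted(a)
    n = len(b)
    i = 0
    mode_frequency = 0
    mode_value = None
    while i <= n - 1:
        run_length, run_value = 1, b[i]
        while i + run_length <= n - 1 and b[i + run_length] == run_value:
            run_length += 1
        if run_length > mode_frequency:      # longest run seen so far
            mode_frequency = run_length
            mode_value = run_value
        i += run_length                      # jump to the next run
    return mode_value

print(mode_presort([5, 1, 5, 7, 6, 5, 7]))  # 5
```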
AVL Animation Link
https://www.cs.usfca.edu/~galles/visualization/AVLtree.html
Binary Search Tree - Best Time
All BST operations are O(d), where d is tree
depth
The minimum d for a binary tree with N nodes is
d = ⌈log₂ N⌉
What is the best case tree?
What is the worst case tree?
So, best case running time of BST operations
is O(log N)
Binary Search Tree - Worst Time
Worst case running time is O(N)
What happens when you Insert elements in
ascending order?
Insert: 2, 4, 6, 8, 10, 12 into an empty BST
Problem: Lack of “balance”:
compare depths of left and right subtree
Unbalanced degenerate tree
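The degradation is easy to see directly. Below is a minimal plain-BST sketch (class and function names are mine) that inserts the slide's ascending sequence and measures the resulting depth: every key hangs off the right spine, so the depth equals N-1.

```python
# A plain (unbalanced) BST: ascending insertions produce a degenerate,
# list-like tree whose depth grows linearly.
class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def insert(root, key):
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)
    return root

def depth(root):
    # Empty tree has height -1 (the convention used later in these slides).
    if root is None:
        return -1
    return 1 + max(depth(root.left), depth(root.right))

root = None
for k in [2, 4, 6, 8, 10, 12]:   # the ascending sequence from the slide
    root = insert(root, k)
print(depth(root))               # 5: six nodes, all on the right spine
```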
Balanced and unbalanced BST
(figures: a balanced BST with root 4, children 2 and 5, and leaves 1 and 3;
an unbalanced BST that chains the keys 1 through 7 down one side; and a
balanced BST with root 4, children 2 and 6, and leaves 1, 3, 5, 7)
Is this “balanced”?
Approaches to balancing trees
Don't balance
May end up with some nodes very deep
Strict balance
The tree must always be balanced perfectly
Pretty good balance
Only allow a little out of balance
Adjust on access
Self-adjusting
Balancing Binary Search Trees
Many algorithms exist for keeping binary
search trees balanced
Adelson-Velskii and Landis (AVL) trees (height-
balanced trees)
Splay trees and other self-adjusting trees
B-trees and other multiway search trees
Perfect Balance
Want a complete tree after every operation
tree is full except possibly in the lower right
This is expensive
For example, insert 2 in the tree on the left and
then rebuild as a complete tree
(figure: inserting 2 into the complete tree with root 6, children 4 and 9,
and leaves 1, 5, 8 forces a full rebuild into the complete tree with root 5,
children 2 and 8, and leaves 1, 4, 6, 9)
AVL - Good but not Perfect Balance
AVL trees are height-balanced binary search
trees
Balance factor of a node
height(left subtree) - height(right subtree)
An AVL tree has balance factor calculated at
every node
For every node, heights of left and right subtree
can differ by no more than 1
Store current heights in each node
Height of an AVL Tree
N(h) = minimum number of nodes in an AVL
tree of height h.
Basis
N(0) = 1, N(1) = 2
Induction
N(h) = N(h-1) + N(h-2) + 1
Solution (recall Fibonacci analysis)
N(h) > φ^h (φ ≈ 1.62)
Height of an AVL Tree
N(h) > φ^h (φ ≈ 1.62)
Suppose we have n nodes in an AVL tree of
height h.
n > N(h) (because N(h) was the minimum)
n > φ^h, hence log_φ n > h (relatively well balanced
tree!!)
h < 1.44 log₂ n (i.e., Find takes O(log n))
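The recurrence and the resulting bound can be checked numerically. This sketch (not a proof) tabulates N(h) for h up to 30 and verifies both claims: N(h) outgrows φ^h, and the height stays below 1.44 log₂ n.

```python
import math

# Build the minimum-size table N(h) from the recurrence above.
N = [1, 2]                                  # N(0) = 1, N(1) = 2
for h in range(2, 31):
    N.append(N[h - 1] + N[h - 2] + 1)

phi = (1 + math.sqrt(5)) / 2                # the golden ratio, ~1.618
for h in range(1, 31):
    assert N[h] > phi ** h                  # N(h) grows at least like phi^h
    assert h < 1.44 * math.log2(N[h] + 2)   # hence height is O(log n)

print(N[10])  # 232: the smallest AVL tree of height 10 has 232 nodes
```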
Node Heights
(figure: two AVL trees with each node labeled by its height; both Tree A,
with root 6 and children 4 and 9, and Tree B, with root 6 and children
4 and 9, have height 2 and all balance factors within ±1)
height of node = h
balance factor = h_left - h_right
empty height = -1
Tree A (AVL) Tree B (AVL)
Node Heights after Insert 7
(figure: inserting 7 into each tree and recomputing heights along the
insertion path; in Tree A all balance factors stay within ±1, but in
Tree B one node gets balance factor 1 - (-1) = 2, so Tree B is no
longer AVL)
height of node = h
balance factor = h_left - h_right
empty height = -1
Tree A (AVL) Tree B (not AVL)
Insert and Rotation in AVL Trees
Insert operation may cause balance factor to
become +2 or -2 for some node
only nodes on the path from insertion point to
root node have possibly changed in height
So after the Insert, go back up to the root node by
node, updating heights
If a new balance factor (the difference h_left - h_right) is
+2 or -2, adjust tree by rotation around the node
Single Rotation in an AVL Tree
2
10
2
0
6
4 9
81 5
1
0
7
0
1
0
2
0
6
4
9
8
1 5
1
0
7
Let the node that needs rebalancing be α.
There are 4 cases:
Outside Cases (require single rotation):
1. Insertion into left subtree of left child of α.
2. Insertion into right subtree of right child of α.
Inside Cases (require double rotation):
3. Insertion into right subtree of left child of α.
4. Insertion into left subtree of right child of α.
The rebalancing is performed through four
separate rotation algorithms.
Insertions in AVL Trees
Consider a valid AVL subtree
(figure: root j with left child k; k has subtrees X and Y, and j has
right subtree Z, all of height h)
AVL Insertion: Outside Case
Inserting into X
destroys the AVL
property at node j
(figure: X now has height h+1)
AVL Insertion: Outside Case
Do a “right rotation”
Single right rotation
“Right rotation” done!
(“Left rotation” is mirror
symmetric)
Outside Case Completed
AVL property has been restored!
(figure: after the rotation, k is the subtree root with children X
(height h+1) and j; j has children Y and Z, both of height h)
AVL Insertion: Inside Case
Consider a valid
AVL subtree
(figure: root j with left child k; k has subtrees X and Y, and j has
right subtree Z, all of height h)
Inserting into Y
destroys the
AVL property
at node j
AVL Insertion: Inside Case
Does “right rotation”
restore balance?
“Right rotation”
does not restore
balance… now k is
out of balance
AVL Insertion: Inside Case
Consider the structure
of subtree Y…
Y = node i and
subtrees V and W
(figure: V and W have heights h and h-1, in some order)
AVL Insertion: Inside Case
We will do a left-right
“double rotation” . . .
Double rotation: first rotation
left rotation complete
Now do a right rotation
Double rotation: second rotation
right rotation complete
Balance has been
restored
(figure: node i is now the subtree root with children k and j; V and W
are split between them, and all subtree heights are back to h, with
h or h-1 inside)
Implementation
balance (1,0,-1)
key
right
left
No need to keep the height; just the difference in height, i.e. the
balance factor; this has to be modified on the path of insertion even if
you don’t perform rotations
Once you have performed a rotation (single or double) you won’t need to
go back up the tree
Single Rotation
RotateFromRight(n : reference node pointer) {
  p : node pointer;
  p := n.right;
  n.right := p.left;
  p.left := n;
  n := p
}
You also need to
modify the heights or
balance factors of n
and p
Double Rotation
Implement Double Rotation in two lines.
DoubleRotateFromRight(n : reference node pointer) {
????
}
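As a sketch, the same rotations can be written in Python. Python has no reference parameters, so each function returns the new subtree root instead of assigning through n, and the caller re-links it; the names mirror the pseudocode above but this is an illustration, not the course's reference code.

```python
# Single and double rotations for the right-heavy case.
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def rotate_from_right(n):        # single rotation: pull n's right child up
    p = n.right
    n.right = p.left
    p.left = n
    return p                     # p is the new root of this subtree

def rotate_from_left(n):         # mirror-symmetric single rotation
    p = n.left
    n.left = p.right
    p.right = n
    return p

def double_rotate_from_right(n): # inside case: rotate the child, then n
    n.right = rotate_from_left(n.right)
    return rotate_from_right(n)

# usage: the inside-case zig-zag 2 -> 6 -> 4 becomes a balanced subtree
root = double_rotate_from_right(Node(2, right=Node(6, left=Node(4))))
print(root.key, root.left.key, root.right.key)  # 4 2 6
```

You would still update the stored heights or balance factors of the affected nodes after each rotation, as the slide notes.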
Insertion in AVL Trees
Insert at the leaf (as for all BST)
only nodes on the path from insertion point to
root node have possibly changed in height
So after the Insert, go back up to the root node by
node, updating heights
If a new balance factor (the difference h_left - h_right) is
+2 or -2, adjust tree by rotation around the node
Example of Insertions in an AVL Tree
(figure: an AVL tree with root 20, children 10 and 30, and leaves 25 and
35, with heights shown at each node)
Insert 5, 40
Example of Insertions in an AVL Tree
(figure: after inserting 5 under 10 and 40 under 35, the tree is still
AVL; heights along both insertion paths are updated)
Now Insert 45
Single rotation (outside case)
(figure: inserting 45 below 40 gives node 35 a balance factor of -2; a
single rotation around 35 makes 40 the parent of 35 and 45, restoring
the AVL property)
Now Insert 34
Double rotation (inside case)
(figure: inserting 34 below 35 gives node 30 a balance factor of -2, with
the new node in the inside subtree; a double rotation lifts 35, making it
the parent of 30 and 40, with 34 becoming the right child of 30 and 45
remaining under 40)
AVL Tree Deletion
Similar but more complex than insertion
Rotations and double rotations needed to
rebalance
Imbalance may propagate upward so that many
rotations may be needed.
Pros and Cons of AVL Trees
Arguments for AVL trees:
1. Search is O(log N) since AVL trees are always balanced.
2. Insertion and deletion are also O(log N).
3. The height balancing adds no more than a constant factor to the speed of
insertion.
Arguments against using AVL trees:
1. Difficult to program & debug; more space for balance factor.
2. Asymptotically faster but rebalancing costs time.
3. Most large searches are done in database systems on disk and use other
structures (e.g. B-trees).
4. May be OK to have O(N) for a single operation if total run time for many
consecutive operations is fast (e.g. Splay trees).